feat(0.2): parity rubric + baseline scores (Track 0.1) #135

Open
pmclSF wants to merge 2 commits into main from feat/0.2-parity-rubric
@pmclSF pmclSF commented May 2, 2026

Summary

Track 0.1 from the 0.2.0 parity-gated release plan. Encodes the 12-area × 17-axis maturity rubric as data so Track 0.2 (the `parity-gate` Go binary) can consume it.

Two files, both pure data:

  • `docs/release/parity/rubric.yaml` — structural source of truth
  • `docs/release/parity/scores.yaml` — current per-cell scores (204 cells, one-line evidence each)

What's in the rubric

  • 3 pillars with priorities for 0.2.0: Gate (1°), Understand (2°), Align (3° soft)
  • 12 functional areas with pillar + tier + surface
  • 17 axes — 7 product (P1-P7), 7 engineering (E1-E7), 3 UI/visual (V1-V3) — each with anchored level definitions for 1, 3, 5
  • Per-pillar floors: Gate ≥ 4, Understand ≥ 3, Align ≥ 3 (soft warn-only)
  • 7 cross-cutting uniformity gates that catch unevenness across detectors, frameworks, commands, and outputs
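
A minimal sketch of how the structure listed above might be held in memory, using only the counts and floors stated in this description. The names `Pillar`, `pillars`, and `axisIDs` are hypothetical; they are not taken from `rubric.yaml` or from the Track 0.2 tool.

```go
package main

import "fmt"

// Pillar is a hypothetical in-memory form of one rubric pillar.
type Pillar struct {
	ID    string
	Floor int  // minimum score every cell in the pillar must reach
	Soft  bool // soft floors warn instead of failing
}

// pillars mirrors the 0.2.0 floors listed in the PR description.
func pillars() []Pillar {
	return []Pillar{
		{ID: "gate", Floor: 4},
		{ID: "understand", Floor: 3},
		{ID: "align", Floor: 3, Soft: true},
	}
}

// axisIDs builds the 17 axis identifiers: P1-P7, E1-E7, V1-V3.
func axisIDs() []string {
	groups := []struct {
		prefix string
		count  int
	}{{"P", 7}, {"E", 7}, {"V", 3}}
	var ids []string
	for _, g := range groups {
		for i := 1; i <= g.count; i++ {
			ids = append(ids, fmt.Sprintf("%s%d", g.prefix, i))
		}
	}
	return ids
}

func main() {
	const areas = 12 // functional areas in the rubric
	axes := axisIDs()
	fmt.Println(len(axes), "axes,", areas*len(axes), "cells")
	for _, p := range pillars() {
		fmt.Printf("%s floor=%d soft=%t\n", p.ID, p.Floor, p.Soft)
	}
}
```

Keeping the floor and the soft flag on the pillar, rather than on each cell, matches the per-pillar floors described above; whether the real YAML nests them the same way is not shown here.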

Baseline reality check

The current scores are honest about the floor: every pillar has at least one cell at score 2, which matches the launch-readiness review's "uneven product" critique. The 0.2.0 work (Tracks 1–10) lifts those cells to clear the gate.

The V-axes (UI/visual) are baselined as the lowest-scoring cluster across the codebase: most output uses ad-hoc styling. Track 10 (Visual & design system) is the cross-cutting lift that addresses this.

What's next

  • Track 0.2 — `cmd/terrain-parity-gate` Go binary that reads both files, emits the matrix + floor map, and exits non-zero when any cell drops below its pillar floor
  • Track 0.4 — wire `make pillar-parity` into CI as a hard gate

Both follow in separate PRs.

Test plan

  • Both YAML files are syntactically valid (they parse with `gopkg.in/yaml.v3`, which is a third-party package rather than part of the Go standard library, in Track 0.2's tool)
  • 12 areas × 17 axes = 204 cells, every cell has a score and evidence
  • Every area in scores.yaml has a corresponding rubric.yaml entry
  • Floor map matches the audit doc (`docs/release/0.2.x-maturity-audit.md`)

Plan link

`/Users/pzachary/.claude/plans/kind-mapping-turing.md` (Track 0.1; Track 0.5 audit doc was committed earlier in PR docs/0.2.x-maturity-audit).

🤖 Generated with Claude Code

Encodes the 12-area × 17-axis maturity rubric as data so the parity
dashboard (Track 0.2) can consume it. Two files:

  docs/release/parity/rubric.yaml — structural source of truth
    - 3 pillars (understand/align/gate) + cross-cutting distribution
    - 12 functional areas with id / name / pillar / tier / surface
    - 17 axes (7 product / 7 engineering / 3 UI-visual) with anchored
      level definitions for 1, 3, 5
    - Per-pillar floor requirements: gate ≥ 4, understand ≥ 3,
      align ≥ 3 (soft)
    - 7 cross-cutting uniformity gates (detector_shape,
      framework_depth, command_shape, doc_scaffold, renderer_tokens,
      empty_state_coverage, voice_and_tone) — these catch unevenness
      across detectors, frameworks, commands, and outputs

  docs/release/parity/scores.yaml — current per-cell scores
    - 12 × 17 = 204 cells, each scored 1–5 with one-line evidence
    - Snapshot taken against `main` at 8545f03 (post PR #131)
    - Reflects the state captured in
      `docs/release/0.2.x-maturity-audit.md`

The baseline scores are honest about the floor: every pillar
currently has at least one cell at score 2, which is exactly what
the launch-readiness review flagged as "uneven product." The 0.2.0
work (Tracks 1–10) lifts those cells to clear the gate — Gate to
≥ 4, Understand to ≥ 3, Align to ≥ 3 soft.

The V-axes (V1 visual consistency / V2 information rhythm / V3
fun-to-use polish) are baselined as the lowest-scoring axis cluster
across the codebase, which matches reality: most user-visible
output uses ad-hoc styling; Track 10 (Visual & design system)
lifts these.

Track 0.2 (parity-gate Go binary) reads both files and emits the
matrix + floor map. Track 0.4 wires `make pillar-parity` into CI as
a hard gate that rejects PRs that drop a cell below its pillar
floor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
github-actions Bot commented May 2, 2026

[INFO] Terrain — Informational only

Insufficient data to assess change risk confidently.

| Metric | Value |
| --- | --- |
| Changed files | 7 (1 source · 1 test) |
| Impacted units | 1 |
| Protection gaps | 1 |

Coverage gaps in changed code

  • cmd/terrain-parity-gate/main.go [LOW] — main.go has no observed test coverage.
    → Add unit tests for main.go.
12 pre-existing issues on changed files
  • cmd/terrain-parity-gate/main_test.go [MED] — [untestedExport] Exported function "TestAxisOrderKey" has no linked tests in the current analysis model.
  • cmd/terrain-parity-gate/main_test.go [MED] — [untestedExport] Exported function "TestBuildReport_FailWhenBelowHardFloor" has no linked tests in the current analysis model.
  • cmd/terrain-parity-gate/main_test.go [MED] — [untestedExport] Exported function "TestBuildReport_MixedHardAndSoft" has no linked tests in the current analysis model.
  • cmd/terrain-parity-gate/main_test.go [MED] — [untestedExport] Exported function "TestBuildReport_PassWhenAllAtFloor" has no linked tests in the current analysis model.
  • cmd/terrain-parity-gate/main_test.go [MED] — [untestedExport] Exported function "TestBuildReport_SoftGateWarns" has no linked tests in the current analysis model.
  • ...and 7 more

Limitations
  • No coverage artifacts provided; protection gaps reflect missing data, not measured absence. Provide --coverage to improve accuracy.
  • Mixed test cultures reduce cross-framework optimization confidence. Consider standardizing on fewer frameworks.

Generated by Terrain · `terrain pr --json` for machine-readable output

Targeted Test Results

No tests selected — change affects only non-code files.

github-actions Bot commented May 2, 2026

Terrain AI Risk Review

| Metric | Value |
| --- | --- |
| AI surfaces | 13 |
| Eval scenarios | 16 |
| Impacted scenarios | 0 |
| Uncovered surfaces | 13 |

Decision: PASS — AI surfaces are covered.

…2/0.3/0.6)

Three Track 0 deliverables in one PR:

  Track 0.2 — `cmd/terrain-parity-gate/main.go`
    Reads rubric.yaml + scores.yaml (defaults to docs/release/parity/),
    validates structure (every cell scored, scores in [1,5], pillar
    references valid), computes per-area floors + per-pillar
    verdicts, emits human-readable matrix or JSON, exits non-zero
    when any hard-gate pillar is below its floor. Soft gates (Align
    in 0.2.0) print WARN but do not fail.

    Three output modes: default matrix, --json, --floor-map (compact).
    Exit codes: 0 PASS, 1 hard-gate FAIL, 2 usage error.

    12 unit tests cover validation rejections, floor computation,
    soft-gate WARN semantics, mixed hard/soft pillar behavior, and
    a real-rubric load test that catches drift between YAML and
    Go types.

  Track 0.3 — `make pillar-parity` (+ `pillar-parity-floor`,
  `pillar-parity-json` variants) wired through Makefile. Same posture
  as `make docs-verify` — anyone can run it locally before opening a
  PR.

  Track 0.6 — CONTRIBUTING.md "Parity gate" section. Documents:
    - per-pillar floors and which are hard / soft
    - the source-of-truth split (rubric.yaml = structure,
      scores.yaml = per-cell numbers, audit doc = prose companion)
    - how a parity-lift PR updates a cell (one-line evidence + score
      change + audit doc narrative if relevant)
    - the seven uniformity gates (advisory in 0.2.0, hard in 0.2.x)
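
The hard/soft gate semantics and exit codes described for Track 0.2 above can be sketched as follows. This is not the code in `cmd/terrain-parity-gate/main.go`; `Verdict`, `pillarVerdict`, and `exitCode` are hypothetical names, and the usage-error exit code 2 is not modelled.

```go
package main

import "fmt"

// Verdict is a hypothetical per-pillar outcome.
type Verdict string

const (
	Pass Verdict = "PASS"
	Warn Verdict = "WARN"
	Fail Verdict = "FAIL"
)

// pillarVerdict compares a pillar's observed floor (its lowest cell
// score) with the required floor. A soft pillar below its floor only
// warns; a hard pillar below its floor fails.
func pillarVerdict(observed, required int, soft bool) Verdict {
	switch {
	case observed >= required:
		return Pass
	case soft:
		return Warn
	default:
		return Fail
	}
}

// exitCode returns 1 if any hard-gate pillar failed, otherwise 0.
// Warnings never change the exit status.
func exitCode(verdicts []Verdict) int {
	for _, v := range verdicts {
		if v == Fail {
			return 1
		}
	}
	return 0
}

func main() {
	verdicts := []Verdict{
		pillarVerdict(2, 3, false), // understand: hard floor 3
		pillarVerdict(2, 3, true),  // align: soft floor 3
		pillarVerdict(2, 4, false), // gate: hard floor 4
	}
	fmt.Println(verdicts, "exit", exitCode(verdicts))
}
```

The inputs in `main` are the floors reported in the baseline output of this commit message, so the sketch yields two hard failures, one soft warning, and a non-zero exit.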

Current baseline output:

  Pillar verdict
    understand   floor=2 / required=3   FAIL  weakest=core_analyze/V1
    align        floor=2 / required=3   WARN (soft)
    gate         floor=2 / required=4   FAIL  weakest=pr_change_scoped/E2
  Overall: FAIL

This is the honest starting point. Tracks 1-10 lift cells until every
hard-gate pillar passes; that's the 0.2.0 release gate.
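
For readers wondering how a line such as `floor=2 / required=3 ... weakest=core_analyze/V1` could be derived, here is a rough sketch. `ScoredCell`, `weakest`, and `summary` are hypothetical names, the tie-breaking rule (earliest cell wins) is an assumption, and every score in `main` other than core_analyze/V1 = 2 is made up for illustration.

```go
package main

import "fmt"

// ScoredCell is a hypothetical flattened (area, axis, score) record.
type ScoredCell struct {
	Area, Axis string
	Score      int
}

// weakest returns the lowest-scoring cell. Ties go to the earliest
// cell in the slice (an assumption). ok is false for an empty slice.
func weakest(cells []ScoredCell) (cell ScoredCell, ok bool) {
	if len(cells) == 0 {
		return ScoredCell{}, false
	}
	low := cells[0]
	for _, c := range cells[1:] {
		if c.Score < low.Score {
			low = c
		}
	}
	return low, true
}

// summary renders one pillar line: the pillar's floor is the score
// of its weakest cell.
func summary(pillar string, required int, cells []ScoredCell) string {
	w, ok := weakest(cells)
	if !ok {
		return fmt.Sprintf("%-12s no cells", pillar)
	}
	return fmt.Sprintf("%-12s floor=%d / required=%d   weakest=%s/%s",
		pillar, w.Score, required, w.Area, w.Axis)
}

func main() {
	cells := []ScoredCell{
		{"core_analyze", "P1", 4}, // illustrative score
		{"core_analyze", "V1", 2}, // matches the baseline output
		{"core_analyze", "E2", 3}, // illustrative score
	}
	fmt.Println(summary("understand", 3, cells))
}
```

Lifting the weakest cell raises the pillar floor only until the next-lowest cell becomes the limit, which is why the tracks lift cells one by one until the floor clears the requirement.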

Track 0.4 (CI hard gate) intentionally not included in this PR. Wiring
parity-gate into CI today would fail every PR until enough cells are
lifted; the gate flips to mandatory after the parity-lifting tracks
land enough to make it useful. For now, anyone can run
`make pillar-parity` locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>